Aligning sequences with repetitive motifs

Authors

  • Peter Kovác
  • Brona Brejová
  • Tomás Vinar
Abstract

Pairwise sequence alignment is among the most intensively studied problems in computational biology. We present a method for aligning two sequences that contain repetitive motifs. This is motivated by biological studies of proteins with zinc finger domains, an important group of regulatory proteins. Due to their evolutionary history, sequences of these proteins contain a variable number of different zinc fingers (short subsequences with specific symbols at each position). Our algorithm uses two types of hidden Markov models (HMMs): pair HMMs and profile HMMs. Profile HMMs describe the structure of sequence motifs. Pair HMMs assign a probability to the alignment of two motifs. Combining these two types of models yields an algorithm that uses different scores when aligning conserved vs. variable motif residues. The dynamic programming algorithm that computes the motif alignments is based on the well-known Viterbi algorithm. We evaluated our model on sequences of zinc finger proteins and compared it with existing alternatives.
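To make the Viterbi step concrete, here is a minimal sketch of Viterbi decoding for a generic three-state pair HMM (match state M, gap states X and Y), in the textbook style. It is not the authors' combined pair/profile model: the function name `pair_hmm_viterbi`, the parameters `delta`, `epsilon`, `p_match` and `k`, and their default values are all illustrative assumptions, and the model has no explicit end state.

```python
import math

NEG_INF = float("-inf")

def pair_hmm_viterbi(x, y, delta=0.1, epsilon=0.3, p_match=0.9, k=4):
    """Most probable alignment of x and y under a simple 3-state pair HMM.

    Assumed (hypothetical) parameters: delta = gap-open probability,
    epsilon = gap-extend probability, p_match = total probability of
    emitting an identical pair in state M, k = alphabet size.
    """
    n, m = len(x), len(y)
    log_q = math.log(1.0 / k)                          # gap-state emission
    log_same = math.log(p_match / k)                   # identical pair in M
    log_diff = math.log((1.0 - p_match) / (k * (k - 1)))  # mismatched pair in M
    # For each state: the allowed previous states and log transition probs.
    trans = {
        "M": [("M", math.log(1 - 2 * delta)),
              ("X", math.log(1 - epsilon)),
              ("Y", math.log(1 - epsilon))],
        "X": [("M", math.log(delta)), ("X", math.log(epsilon))],
        "Y": [("M", math.log(delta)), ("Y", math.log(epsilon))],
    }
    # How many symbols of x and y each state consumes.
    step = {"M": (1, 1), "X": (1, 0), "Y": (0, 1)}

    V = {s: [[NEG_INF] * (m + 1) for _ in range(n + 1)] for s in "MXY"}
    back = {s: [[None] * (m + 1) for _ in range(n + 1)] for s in "MXY"}
    V["M"][0][0] = 0.0  # start in M before any symbol is emitted

    for i in range(n + 1):
        for j in range(m + 1):
            for s in "MXY":
                di, dj = step[s]
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0:
                    continue
                if s == "M":
                    emit = log_same if x[pi] == y[pj] else log_diff
                else:
                    emit = log_q
                prev, score = max(
                    ((p, V[p][pi][pj] + t) for p, t in trans[s]),
                    key=lambda c: c[1],
                )
                if score > NEG_INF:
                    V[s][i][j] = emit + score
                    back[s][i][j] = prev

    # Trace back from the best-scoring state at the end of both sequences.
    state = max("MXY", key=lambda s: V[s][n][m])
    best = V[state][n][m]
    i, j = n, m
    top, bottom = [], []
    while i > 0 or j > 0:
        di, dj = step[state]
        top.append(x[i - 1] if di else "-")
        bottom.append(y[j - 1] if dj else "-")
        prev = back[state][i][j]
        i, j = i - di, j - dj
        state = prev
    return best, "".join(reversed(top)), "".join(reversed(bottom))

score, row_x, row_y = pair_hmm_viterbi("ACGT", "AGT")
print(row_x)
print(row_y)
```

In this sketch every position is scored with the same emission probabilities. The abstract's idea of scoring conserved and variable motif residues differently would correspond to making those emissions depend on the motif position, which is not attempted here.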

Similar resources

Analysis of repetitive amino acid motifs reveals the essential features of spider dragline silk proteins

The extraordinary mechanical properties of spider dragline silk are dependent on the highly repetitive sequences of the component proteins, major ampullate spidroin 1 and 2 (MaSp1 and MaSp2). MaSp sequences are dominated by repetitive modules composed of short amino acid motifs; however, the patterns of motif conservation through evolution and their relevance to silk characteristics are not wel...

Small, repetitive DNAs contribute significantly to the expanded mitochondrial genome of cucumber.

Closely related cucurbit species possess eightfold differences in the sizes of their mitochondrial genomes. We cloned mitochondrial DNA (mtDNA) fragments showing strong hybridization signals to cucumber mtDNA and little or no signal to watermelon mtDNA. The cucumber mtDNA clones carried short (30-53 bp), repetitive DNA motifs that were often degenerate, overlapping, and showed no homology to an...

A study of the repetitive structure and distribution of short motifs in human genomic sequences

Over the last several years the search for functional genomic elements by exploiting motif over-representation became increasingly popular. However, about half of the human genome is repetitive, and that is also the case with most higher eukaryotes. In this study we have shown that in addition to these known repeats, human sequences feature many short over-represented motifs, and that their fre...

Functional motifs in Escherichia coli NC101

Escherichia coli (E. coli) bacteria can damage the DNA of gut lining cells and, according to recent reports, may encourage the development of colon cancer. Genetic switches are specific sequence motifs, and many of them are drug targets. It is therefore of interest to identify motifs and their locations in sequences. In the present study, the Gibbs sampler algorithm was used to predict and find functional m...

The roles of EPIYA sequence to perturb the cellular signaling pathways and cancer risk

It was shown that several pathogenic bacterial effector proteins contain the Glu-Pro-Ile-Tyr-Ala (EPIYA) or a similar sequence. These bacterial EPIYA effectors are delivered into host cells via a type III or IV secretion system, where they undergo tyrosine phosphorylation at the EPIYA sequences, which triggers interaction with multiple host cell SH2 domain-containing proteins and thereby...

Publication date: 2012